Research on Job Scheduling Algorithm in Hadoop

Authors

  • Yang XIA
  • Lei WANG
  • Qiang ZHAO
  • Gongxuan ZHANG
Abstract

Building on an in-depth study of the Fair Scheduling strategy in Hadoop clusters, this paper defines the Node Health Degree by constructing a function that relates node load to job failure rate, and proposes a job scheduling algorithm based on it. Nodes are grouped into three categories according to their Node Health Degree, so that jobs are assigned in accordance with node load and the resource load stays balanced. Simulation results comparing the algorithm with the FIFO and Fair scheduling algorithms show that it reduces the job failure rate and improves cluster throughput.
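The abstract does not give the actual health function or the category boundaries, so the following is only a minimal sketch of the grouping idea under stated assumptions. The weighted-sum formula, the `load_weight` parameter, the thresholds 0.4 and 0.7, and the category names are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    load: float       # normalized node load in [0, 1] (assumed scale)
    fail_rate: float  # fraction of failed jobs in [0, 1] (assumed scale)


def health_degree(node: Node, load_weight: float = 0.5) -> float:
    """Hypothetical health score in [0, 1]; higher means healthier.

    Assumption: health falls linearly with a weighted sum of load and
    failure rate. The paper's real function is not given in the abstract.
    """
    penalty = load_weight * node.load + (1.0 - load_weight) * node.fail_rate
    return 1.0 - penalty


def categorize(node: Node, low: float = 0.4, high: float = 0.7) -> str:
    """Place a node in one of three categories (thresholds are assumed)."""
    score = health_degree(node)
    if score >= high:
        return "healthy"
    if score >= low:
        return "sub-healthy"
    return "unhealthy"


def group_nodes(nodes):
    """Group node names by category so a scheduler can pick targets."""
    groups = {"healthy": [], "sub-healthy": [], "unhealthy": []}
    for node in nodes:
        groups[categorize(node)].append(node.name)
    return groups


cluster = [
    Node("node-a", load=0.1, fail_rate=0.0),
    Node("node-b", load=0.6, fail_rate=0.4),
    Node("node-c", load=0.9, fail_rate=0.9),
]
groups = group_nodes(cluster)
```

A scheduler built on this sketch might send new jobs to the "healthy" group first and avoid the "unhealthy" group until its load drops. How the paper maps jobs to categories is not specified in the abstract.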

Similar resources

A Hadoop Job Scheduling Algorithm Based on Pagerank

To address the shortcomings of the job scheduling algorithm based on the classical cloud computing model in Hadoop, a new job scheduling algorithm based on the PageRank algorithm is proposed. Under the premise of ensuring the user experience, we propose a new job scheduling algorithm named ValidRank, which is based on the combination of hierarchical weight and waiting time. Then for th...

Job Attentive Scheduling Algorithm in Hadoop

In recent years cloud services have gained much attention as a result of their availability, scalability, and low cost. One use of these services has been for the execution of scientific workflows as part of Big Data Analytics, which are employed in a diverse range of fields including astronomy, physics, seismology, and bioinformatics. There has been much research on heuristic scheduling algori...

The Improved Job Scheduling Algorithm of Hadoop Platform

This paper discusses several job scheduling algorithms for the Hadoop platform and, in view of the shortcomings of the algorithms in use, proposes a job scheduling optimization algorithm based on Bayes classification. The proposed algorithm can be summarized as follows. In the scheduling algorithm based on Bayes classification, the jobs in the job queue are classified into bad job and...

Improved Fair Scheduling Algorithm for Hadoop Clustering, by Sneha and Shoney Sebastian

The traditional way of storing such a huge amount of data is not convenient, because processing the data in later stages is a very tedious job. Nowadays, Hadoop is used to store and process large amounts of data. Statistics on the data generated in recent years show that the volume has been very high in the last 2 years. Hadoop is a good framework for storing and processing data efficiently. It works li...

Hadoop Scheduling Base On Data Locality

In Hadoop, job scheduling is an independent module: users can design their own job scheduler based on their actual application requirements and thereby meet their specific business needs. Currently, Hadoop has three schedulers: FIFO, computing capacity scheduling, and fair scheduling, all of which use a task allocation strategy that considers data locality only in a simple way. They neither suppor...

Publication date: 2011